342 PART 6 Analyzing Survival Data
Assessing goodness-of-fit and predictive
ability of the model
There are several measures of how well a regression model fits the survival data.
These measures can be useful when you’re choosing among several different
models:
» Should you include a possible predictor variable (like age) in the model?
» Should you include the squares or cubes of predictor variables in the model
(meaning including age² or age³ in addition to age)?
» Should you include a term for the interaction between two predictors?
Your software may offer one or more of the following goodness-of-fit measures:
» A measure of agreement between the observed and predicted outcomes
called concordance (see the bottom of Figure 23-4). Concordance indicates the
extent to which participants with higher predicted hazard values had shorter
observed survival times, which is what you’d expect. Figure 23-4 shows a
concordance of 0.642 for this regression, on a scale where 0.5 means no better
than chance and 1 means perfect agreement.
» An r (or r²) value that’s interpreted like a correlation coefficient in ordinary
regression, meaning the larger the r² value, the better the model fits the data.
In Figure 23-4, r² (labeled Rsquare) is 0.116.
» A likelihood ratio test and associated p value that compares the full model,
which includes all the parameters, to a model consisting of just the overall
baseline function. In Figure 23-4, the likelihood ratio p value is shown as
4.46e-06, which is scientific notation for p = 0.00000446, indicating a model
that includes the CenterCD and Radiation variables can predict survival
statistically significantly better than just the overall (baseline) survival curve.
» Akaike’s Information Criterion (AIC) is especially useful for comparing alternative
models but is not included in Figure 23-4.
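To make the measures in the preceding list more concrete, here is a minimal Python sketch of how three of them could be calculated by hand. It is not how any particular statistics package does it. The function names (concordance, chi2_sf, likelihood_ratio_test, aic) are hypothetical, the data and log-likelihood values are made up for illustration and are not taken from Figure 23-4, and the concordance calculation simply skips pairs with tied survival times, which is a simplifying assumption.

```python
import math

def concordance(times, events, risk_scores):
    """Harrell-style concordance for right-censored data (tied times are skipped)."""
    concordant, usable = 0.0, 0
    for i in range(len(times)):
        if events[i] != 1:
            continue  # a censored participant can't be the earlier event in a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # higher predicted hazard, shorter survival
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / usable

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 tail and add one term per extra 2 df
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total

def likelihood_ratio_test(loglik_null, loglik_full, df):
    """Return (test statistic, p value) comparing the full model to the null model."""
    statistic = 2.0 * (loglik_full - loglik_null)
    return statistic, chi2_sf(statistic, df)

def aic(loglik, n_params):
    """Akaike's Information Criterion; smaller values are preferred."""
    return 2.0 * n_params - 2.0 * loglik

# Hypothetical data: survival times, event flags (1 = event, 0 = censored),
# and predicted hazard scores from some fitted model.
times = [5, 8, 12, 14, 20, 21]
events = [1, 1, 0, 1, 1, 0]
risk_scores = [2.1, 1.4, 0.9, 1.6, 0.3, 0.2]
print("Concordance:", concordance(times, events, risk_scores))

# Hypothetical log-likelihoods for a null model and a two-predictor model.
statistic, p_value = likelihood_ratio_test(-120.5, -108.2, df=2)
print("Likelihood ratio statistic:", statistic, "p value:", p_value)
print("AIC of the full model:", aic(-108.2, 2))
```

When you compare alternative models fitted to the same data, the one with the smaller AIC is generally preferred, because AIC rewards a higher likelihood but charges a penalty for every extra parameter.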
Focusing on baseline survival
and hazard functions
The baseline survival function is represented as a table with two columns — time
and predicted survival — and a row for each distinct time at which one or more
events were observed.
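As a rough illustration of what such a table looks like, the following Python sketch builds one from a handful of made-up observations. It assumes the Breslow estimator of the baseline cumulative hazard, which is one common approach but not necessarily the one your software uses. The function name baseline_survival_table and the data are hypothetical, and the linear predictors (the sum of coefficient times predictor for each participant) are assumed to come from an already fitted model.

```python
import math

def baseline_survival_table(times, events, linear_predictors):
    """Return [(time, baseline survival)], one row per distinct event time.

    Uses the Breslow estimator: at each event time, the baseline hazard
    increment is (number of events) / (sum of exp(linear predictor) over
    everyone still at risk), and survival is exp(-cumulative hazard).
    """
    relative_risk = [math.exp(lp) for lp in linear_predictors]
    # Only times at which at least one event (not a censoring) occurred get a row
    event_times = sorted({t for t, e in zip(times, events) if e == 1})

    cumulative_hazard = 0.0
    table = []
    for t in event_times:
        n_events = sum(1 for ti, ei in zip(times, events) if ei == 1 and ti == t)
        risk_set_total = sum(r for ti, r in zip(times, relative_risk) if ti >= t)
        cumulative_hazard += n_events / risk_set_total
        table.append((t, math.exp(-cumulative_hazard)))
    return table

# Hypothetical data: times, event flags (1 = event, 0 = censored), and
# linear predictors from a fitted model (0.0 means "at the reference values").
times = [2, 3, 3, 5, 7, 9]
events = [1, 1, 0, 1, 0, 1]
linear_predictors = [0.4, -0.2, 0.0, 0.3, -0.5, 0.1]

print("time  predicted survival")
for t, s in baseline_survival_table(times, events, linear_predictors):
    print(f"{t:>4}  {s:.4f}")
```

Notice that censored observations never create a row of their own; they only affect the table by staying in the at-risk group until their censoring time.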